OPTIMAL CO-ADAPTED COUPLING FOR A RANDOM 
WALK ON THE HYPER-COMPLETE-GRAPH 



By Stephen B. Connor 
University of York 

Abstract. Let $G_d$ be the complete graph with $d$ vertices, and let
$X$ and $Y$ be two simple symmetric continuous-time random walks on
the vertices of $G_d^n$. When $d = 2$, $X$ and $Y$ are random walks on the
hypercube $\mathbb{Z}_2^n$, for which a stochastically fastest co-adapted coupling
is described in (0). Here we extend this result to random walks on $G_d^n$,
once again producing a stochastically optimal coupling: as $d \to \infty$
we show that this optimal co-adapted coupling tends to a maximal
coupling.



1. Introduction. Let $G_d$ be the complete graph with $d$ vertices, $d \ge 2$,
and let $G_d^n$ be the set of $n$-tuples of the form $(x(1), \dots, x(n))$ with
$x(i) \in G_d$, $1 \le i \le n$. $G_d^n$ forms a group under coordinate-wise addition
modulo $d$, and $G_2^n = \mathbb{Z}_2^n$. For $i = 1, \dots, n$ we define $e_i \in G_d^n$ to
satisfy $e_i(k) = \mathbf{1}_{[i=k]}$, where $\mathbf{1}_{[\cdot]}$ denotes the indicator function.
For $x, y \in G_d^n$, let $|x - y|$ denote the Hamming distance between $x$ and $y$.
(In particular, $|x|$ equals the number of non-zero coordinates of $x$.)

Let $\Lambda_i$, $1 \le i \le n$, be independent unit-rate marked Poisson processes
on $[0, \infty)$, with marks distributed uniformly on the set $\{1, \dots, d-1\}$. A
simple symmetric continuous-time random walk $X$ on $G_d^n$ may be defined
by increasing the $i$th coordinate of $X$ by $k$ (mod $d$) at incident times of $\Lambda_i$
for which the corresponding mark is equal to $k$. We write $\mathcal{L}(X_t)$ for the law
of $X$ at time $t$. The unique equilibrium distribution of $X$ is the uniform
distribution on $G_d^n$.
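The construction of $X$ from marked Poisson processes can be sketched in code. The following is a minimal illustration only: the function name `simulate_walk`, the labelling of the vertices of $G_d$ as $0, \dots, d-1$, and the use of a single superposed rate-$n$ clock are assumptions made for the example, not part of the paper.

```python
import random

def simulate_walk(n, d, t_max, x0=None, seed=0):
    """Return the state at time t_max of a simple symmetric walk on G_d^n.

    Assumption for this sketch: vertices of G_d are labelled 0, ..., d-1.
    """
    rng = random.Random(seed)
    x = list(x0) if x0 is not None else [0] * n
    t = 0.0
    while True:
        # The n independent unit-rate processes superpose to one rate-n
        # process, so the next incident arrives after an Exp(n) waiting time.
        t += rng.expovariate(n)
        if t > t_max:
            return x
        i = rng.randrange(n)        # which coordinate's process fired
        k = rng.randint(1, d - 1)   # mark, uniform on {1, ..., d-1}
        x[i] = (x[i] + k) % d       # increase coordinate i by k (mod d)
```

For example, `simulate_walk(5, 3, 2.0)` returns a list of five values in `{0, 1, 2}`; for large `t_max` each coordinate is approximately uniform on the $d$ vertices.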

Suppose that we now wish to couple two such processes, $X$ and $Y$, starting
from different states.

Definition 1.1. A coupling of $X$ and $Y$ is a process $(X^c, Y^c)$ on $G_d^n \times G_d^n$
such that

\[ X^c \stackrel{\mathcal{D}}{=} X \quad\text{and}\quad Y^c \stackrel{\mathcal{D}}{=} Y. \]

In this paper we will be primarily concerned with co-adapted couplings:

Definition 1.2. A coupling $(X^c, Y^c)$ is called co-adapted if there exists
a filtration $(\mathcal{F}_t)_{t \ge 0}$ such that

1. $X^c$ and $Y^c$ are both adapted to $(\mathcal{F}_t)_{t \ge 0}$;

2. for any $0 \le s \le t$,

\[ \mathcal{L}(X^c_t \mid \mathcal{F}_s) = \mathcal{L}(X^c_t \mid X^c_s) \quad\text{and}\quad \mathcal{L}(Y^c_t \mid \mathcal{F}_s) = \mathcal{L}(Y^c_t \mid Y^c_s). \]

In other words, $(X^c, Y^c)$ is co-adapted if $X^c$ and $Y^c$ are both Markov
with respect to a common filtration, $(\mathcal{F}_t)_{t \ge 0}$. We denote by $\mathcal{C}$ the set of all
co-adapted couplings of $X$ and $Y$. For $t \ge 0$, let

\[ U^c_t = \{1 \le i \le n : X^c_t(i) \ne Y^c_t(i)\} \quad\text{and}\quad M^c_t = \{1 \le i \le n : X^c_t(i) = Y^c_t(i)\} \]

respectively denote the set of unmatched and matched coordinates at time
$t$. We define the coupling time by

\[ \tau^c = \inf\{t \ge 0 : X^c_s = Y^c_s \ \forall s \ge t\} = \inf\{t \ge 0 : U^c_s = \emptyset \ \forall s \ge t\}. \]

If $(X^c, Y^c)$ is co-adapted then $\tau^c$ is a randomised stopping time with respect
to the individual chains.

In (0), an explicit, intuitive coupling strategy is described when $d = 2$, and
is shown to yield the stochastically minimal coupling time of all co-adapted
couplings. This coupling strategy at time $t$ depends only on the parity of
$N_t = |U_t|$, and may be summarised as follows:

• matched coordinates are always made to move synchronously;

• if $N$ is odd, all unmatched coordinates of $X$ and $Y$ are made to evolve
independently until $N$ becomes even;

• if $N$ is even, unmatched coordinates are coupled in pairs: when an
unmatched coordinate on $X$ flips (thereby making a new match), a
different unmatched coordinate on $Y$ is flipped at the same instant
(making a total of two new matches).



The work of [2] motivates the following question: what is the optimal
co-adapted coupling when $d > 2$? Intuitively, we expect the optimal
strategy when $d = 2$ to become inefficient as $d$ gets large: the rate at which
unmatched coordinates can be made to agree using either 'independent' or
'pairwise' coupling (as described above) is proportional to $N/d$. In Sections 2
and 3 we show how to describe the problem of finding an optimal co-adapted
coupling as an exercise in optimal stochastic control (generalising the idea
used in (0)), and solve this problem to once again produce a stochastically
minimal coupling time; some of the longer proofs can be found in Section 5.
In Section 4 we study the behaviour of this coupling as $d \to \infty$ and show
that, for fixed $n$, the optimal co-adapted coupling tends to a maximal
coupling.

2. Co-adapted couplings for random walks on $G_d^n$. In order to
find the optimal co-adapted coupling of $X$ and $Y$, it is first necessary to be
able to describe a general coupling strategy $c \in \mathcal{C}$. To this end, let
$\Lambda_{(i,k)(j,l)}$ ($1 \le i, j \le n$ and $1 \le k, l \le d$) be independent marked Poisson
processes on $[0, \infty)$, each of rate $(d-1)^{-1}$, and with marks
$W_{(i,k)(j,l)} \sim \mathrm{Uniform}[0, 1]$. We let $(\mathcal{F}_t)_{t \ge 0}$ be any filtration satisfying

\[ \sigma\Big( \bigcup_{i,j,k,l} \big\{ \Lambda_{(i,k)(j,l)}(s),\ W_{(i,k)(j,l)}(s) : s \le t \big\} \Big) \subseteq \mathcal{F}_t, \quad \forall t \ge 0. \]

The transitions of $X^c$ and $Y^c$ will be driven by the marked Poisson processes,
and controlled by a process $\{Q^c(t)\}_{t \ge 0}$ which is adapted to $(\mathcal{F}_t)_{t \ge 0}$. Here,

\[ Q^c(t) = \{ q^c_{(r,s)}(t) : 1 \le r, s \le nd \} \]

is an $(nd) \times (nd)$ doubly-stochastic matrix.

A similar argument to that in (0) shows that a general co-adapted coupling
for $X$ and $Y$ may be defined as follows: if there is a jump in the process
$\Lambda_{(i,k)(j,l)}$ at time $t \ge 0$, and the mark $W_{(i,k)(j,l)}(t)$ satisfies

\[ W_{(i,k)(j,l)}(t) \le q^c_{([i-1]d+k,\ [j-1]d+l)}(t), \]

then set $X^c_t(i) = k$ and $Y^c_t(j) = l$. To ease notation, in the sequel we shall
write $q^c_{(i,k)(j,l)}(t)$ instead of $q^c_{([i-1]d+k,\ [j-1]d+l)}(t)$: thus $q^c_{(i,k)(j,l)}(t)$ is
proportional to the instantaneous rate at which $(X^c_t(i), Y^c_t(j))$ jumps to $(k, l)$.

Note that if $k = X^c_{t-}(i)$ and the 'move' is accepted, there is no change to
the value of $X^c(i)$ at time $t$: thus this setup allows for the possibility of a
jump taking place on only one process at any given instant. Furthermore,
the total rate at which $X^c_t(i)$ changes value is given by

\[ \frac{1}{d-1} \sum_{k \ne X^c_{t-}(i)} \Big( \sum_{j=1}^{n} \sum_{l=1}^{d} q^c_{(i,k)(j,l)}(t) \Big) = 1, \]

since the bracketed double sum is simply that of the $([i-1]d + k)$th row of
$Q^c(t)$, and hence equal to one. Similarly, the rate at which $X^c(i)$ changes
from $r$ to $r + k$ (mod $d$), $1 \le k \le d - 1$, is equal to

\[ \frac{1}{d-1} \sum_{j=1}^{n} \sum_{l=1}^{d} q^c_{(i,r+k)(j,l)}(t) = \frac{1}{d-1}. \]

From this construction it follows directly that $X^c$ and $Y^c$ both have the
correct marginal transition rates to be continuous-time simple random walks
on $G_d^n$ as described in Section 1, and are co-adapted.
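The marginal-rate argument can be checked numerically on an arbitrary control matrix. The sketch below is illustrative only: it builds a doubly-stochastic matrix as a convex combination of random permutation matrices (one convenient choice, not something the paper prescribes), and the names `random_doubly_stochastic` and `x_jump_rate` are hypothetical.

```python
import random

def random_doubly_stochastic(size, n_perms=5, seed=0):
    """Average of n_perms random permutation matrices: rows and columns sum to 1."""
    rng = random.Random(seed)
    q = [[0.0] * size for _ in range(size)]
    for _ in range(n_perms):
        perm = list(range(size))
        rng.shuffle(perm)
        for r, s in enumerate(perm):
            q[r][s] += 1.0 / n_perms
    return q

def x_jump_rate(q, d, i, k):
    """Rate at which X(i) is set to k (i and k are 1-based, as in the text).

    Each driving process has rate 1/(d-1) and is accepted with probability
    given by the matching entry of q, so the rate is the row sum / (d-1).
    """
    row = (i - 1) * d + (k - 1)   # 0-based index of row [i-1]d + k
    return sum(q[row]) / (d - 1)
```

Whatever doubly-stochastic matrix is supplied, `x_jump_rate` returns $1/(d-1)$ for every pair $(i, k)$, so summing over the $d - 1$ values $k \ne X_{t-}(i)$ gives total jump rate one, as required of a simple symmetric walk.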

3. Optimal coupling. Our proposed optimal coupling once again
depends upon the parity of $N_t$, the number of unmatched coordinates of $X$
and $Y$ at time $t$. It now also depends upon how this number relates to the
parameter $d$.

Definition 3.1. The matrix process $\hat Q$ corresponding to the coupling
$\hat c_d$ is as follows:

[C1] $\hat q_{(i,k)(i,k)}(t) = 1$ for all $i \in M_t$ and all $k = 1, \dots, d$;

[C2] if $N_t$ is even, or $N_t \ge 2(d-1)/(d-2)$: for $i, j \in U_t$ with $i \ne j$,

    (i) $\hat q_{(i,Y_{t-}(i))(j,X_{t-}(j))}(t) = \dfrac{1}{N_t - 1}$,

    (ii) $\hat q_{(i,k)(i,k)}(t) = \mathbf{1}_{[k \ne X_{t-}(i),\ k \ne Y_{t-}(i)]}$;

[C3] if $N_t$ is odd and $N_t < 2(d-1)/(d-2)$, then $\hat q_{(i,k)(i,k)}(t) = 1$ for all
$i \in U_t$ and all $k = 1, \dots, d$.

Part [C1] of this definition ensures that no matches are ever broken under
$\hat c_d$. The final two items define the strategy for making new matches. If $N_t$ is
even, or else sufficiently large, we will see that [C2](i) implies that the rate at
which two new matches are made is maximised; [C2](ii) then maximises the
rate at which a single new match is made, subject to the constraint imposed
by [C2](i). Finally, [C3] implies that if $N_t$ is odd, with $N_t < 2(d-1)/(d-2)$,
the coupling maximises the rate at which single matches are made. (Note
that if $d = 2$, [C3] applies whenever $N_t$ is odd; if $d = 3$ then it applies when
$N_t \in \{1, 3\}$; while if $d \ge 4$, [C3] applies only when $N_t = 1$.)

Informally, $\hat c_d$ couples $X$ and $Y$ as follows when $d \ge 4$ and $N_t \ge 2$. If an
unmatched coordinate $X(i)$ jumps to state $k$ at time $t$, then:

• if $Y_{t-}(i) = k$ (which occurs with probability $1/(d-1)$, since $k$ is uniform
on the $d - 1$ values other than $X_{t-}(i)$), choose another unmatched
coordinate $j$ uniformly at random, and set $Y_t(j) = X_{t-}(j)$; this decreases
$N$ by two;

• if $Y_{t-}(i) \ne k$, set $Y_t(i) = k$; this decreases $N$ by one.
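The two cases above determine the evolution of the number of unmatched coordinates, which is all that is needed to sample the coupling time. The following sketch makes several assumptions: it tracks only $N$ rather than the full pair $(X, Y)$, it covers $d \ge 4$ only, it uses the double-match probability $1/(d-1)$ implied by the rates in (3.8), and the name `simulate_coupling_time` is hypothetical.

```python
import random

def simulate_coupling_time(m0, d, seed=0):
    """Sample the coupling time started from m0 unmatched coordinates (d >= 4)."""
    assert d >= 4 and m0 >= 0
    rng = random.Random(seed)
    n_unmatched, t = m0, 0.0
    while n_unmatched > 0:
        if n_unmatched == 1:
            # [C3]: both walks are sent to a common value k, for each of the
            # d possible k at rate 1/(d-1), so the match occurs at rate d/(d-1).
            t += rng.expovariate(d / (d - 1))
            n_unmatched = 0
        else:
            # Each unmatched X(i) jumps at unit rate to a uniformly chosen new value k.
            t += rng.expovariate(n_unmatched)
            if rng.randrange(d - 1) == 0:
                n_unmatched -= 2    # k == Y(i): a second coordinate is matched too
            else:
                n_unmatched -= 1    # k != Y(i): Y(i) is set to k
    return t
```

Averaging `simulate_coupling_time(1, 4, seed=s)` over many seeds should give a value close to $(d-1)/d = 0.75$, the mean of an exponential variable of rate $d/(d-1)$.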

Now define

\[ \hat v_d(x, y, t) = P[\hat\tau_d > t \mid X_0 = x,\ Y_0 = y] \tag{3.1} \]

to be the tail probability of the coupling time under $\hat c_d$. The main result of
this paper is the following generalisation of [3, Theorem 3.1].

Theorem 3.2. For any states $x, y \in G_d^n$ and time $t \ge 0$,

\[ \hat v_d(x, y, t) = \inf_{c \in \mathcal{C}} P[\tau^c > t \mid X_0 = x,\ Y_0 = y]. \tag{3.2} \]

In other words, $\hat\tau_d$ is the stochastic minimum of all co-adapted coupling times
for the pair $(X, Y)$.

From Definition 3.1 it is evident that $\hat c_d$ is invariant under coordinate
permutation, and that $\hat v_d(x, y, t)$ only depends on $(x, y)$ through $|x - y|$,
and so we shall usually write

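\[ \hat v_d(m, t) = P[\hat\tau_d > t \mid N_0 = m], \]

with the convention that $\hat v_d(m, t) = 0$ for $m \le 0$.

As in [2], we shall write $\lambda^c_t(m, m+s)$ for the rate (according to $Q^c(t)$) at
which $N^c_t$ jumps from $m$ to $m + s$, for $s \in \{-2, \dots, 2\}$. For example:

\[ (d-1)\lambda^c_t(m, m-2) = \sum_{\substack{i,j \in U_t \\ i \ne j}} \ \sum_{1 \le k,l \le d} q^c_{(i,k)(j,l)}(t)\, \mathbf{1}_{[k = Y_{t-}(i),\ l = X_{t-}(j)]} \tag{3.3} \]

and

\begin{align*}
(d-1)\lambda^c_t(m, m-1) ={}& \sum_{i \in U_t} \sum_{1 \le k \le d} q^c_{(i,k)(i,k)}(t) \\
&+ \sum_{\substack{i,j \in U_t \\ i \ne j}} \ \sum_{1 \le k,l \le d} q^c_{(i,k)(j,l)}(t) \big( \mathbf{1}_{[k = Y_{t-}(i),\ l \ne X_{t-}(j)]} + \mathbf{1}_{[k \ne Y_{t-}(i),\ l = X_{t-}(j)]} \big) \\
&+ \sum_{\substack{i \in U_t \\ j \in M_t}} \ \sum_{1 \le k,l \le d} q^c_{(i,k)(j,l)}(t)\, \mathbf{1}_{[k = Y_{t-}(i),\ l = Y_{t-}(j)]} \\
&+ \sum_{\substack{i \in M_t \\ j \in U_t}} \ \sum_{1 \le k,l \le d} q^c_{(i,k)(j,l)}(t)\, \mathbf{1}_{[k = X_{t-}(i),\ l = X_{t-}(j)]}. \tag{3.4}
\end{align*}

The expression for $\lambda^c_t(m, m-2)$ is easy to understand: $N^c_t$ decreases by two
if and only if different unmatched coordinates on $X$ and $Y$ flip at the same
instant, with each flip making one new match.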

$\lambda^c_t(m, m-1)$ comprises four sums however, and so requires a little more
explanation. The first sum in (3.4) gives the rate at which the same
unmatched coordinate flips on both $X$ and $Y$ to the same value, making one
new match. The second term is the rate at which an unmatched coordinate
on one process flips to make a new match, while a different unmatched
coordinate on the other process flips without making another new match.
Finally, the third and fourth sums in (3.4) give the rate at which an
unmatched coordinate on one process flips and makes a new match, while on
the other process a matched coordinate is selected and made to stay at its
current value.

Similar decompositions may be written down for $\lambda^c_t(m, m+1)$ and
$\lambda^c_t(m, m+2)$, but we will have no need of them in the sequel.



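Using the constraints on the row and column sums of $Q^c(t)$, the first of these
terms may be bounded as follows:

\[ (d-1)\lambda^c_t(m, m-2) = \sum_{\substack{i,j \in U_t \\ i \ne j}} \sum_{k,l} q^c_{(i,k)(j,l)}(t)\, \mathbf{1}_{[k = Y_{t-}(i)]} \mathbf{1}_{[l = X_{t-}(j)]} \le \sum_{i \in U_t} \Big( \sum_{j=1}^{n} \sum_{l=1}^{d} q^c_{(i,Y_{t-}(i))(j,l)}(t) \Big) = |U_t| = m. \tag{3.5} \]

Similar simple bounds on the sums in (3.4) show that
$0 \le (d-1)\lambda^c_t(m, m-1) \le md$. Furthermore,

\begin{align*}
(d-1)\lambda^c_t(m, m-1) + 2(d-1)\lambda^c_t(m, m-2)
\le{}& \sum_{i \in U_t} \Big( \sum_{k=1}^{d} q^c_{(i,k)(i,k)}(t)\, \mathbf{1}_{[k \ne X_{t-}(i),\ k \ne Y_{t-}(i)]} \Big) \\
&+ \sum_{i \in U_t} \Big( \sum_{j=1}^{n} \sum_{l=1}^{d} q^c_{(i,Y_{t-}(i))(j,l)}(t) \Big)
 + \sum_{j \in U_t} \Big( \sum_{i=1}^{n} \sum_{k=1}^{d} q^c_{(i,k)(j,X_{t-}(j))}(t) \Big) \\
\le{}& m(d-2) + m + m = md.
\end{align*}

Denote by $L_d^n$ the set of nonnegative $\lambda$ satisfying the linear constraint

\[ (d-1)\lambda(m, m-1) + 2(d-1)\lambda(m, m-2) \le md. \tag{3.6} \]

When $d = 2$ this reduces to the constraint of [2]: $\lambda(m, m-1) + 2\lambda(m, m-2) \le 2m$.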

Proposition 3.3. Under $\hat c_d$ the following set of equations hold:

\[ \lambda^{\hat c_d}_t(m, m+1) = \lambda^{\hat c_d}_t(m, m+2) = 0; \tag{3.7} \]

if $m$ is even, or if $m \ge 2(d-1)/(d-2)$, then

\[ (d-1)\lambda^{\hat c_d}_t(m, m-2) = m \quad\text{and}\quad (d-1)\lambda^{\hat c_d}_t(m, m-1) = m(d-2); \tag{3.8} \]

if $m$ is odd and $m < 2(d-1)/(d-2)$ then

\[ \lambda^{\hat c_d}_t(m, m-2) = 0 \quad\text{and}\quad (d-1)\lambda^{\hat c_d}_t(m, m-1) = md. \tag{3.9} \]

(See Section 5 for the proof.) It follows that the upper bound of (3.6)
is always attained under $\hat c_d$. Although the framework laid out above for
describing a general coupling $c \in \mathcal{C}$ differs from the setup in (0), we can
immediately obtain the result of that paper:

Corollary 3.4. Theorem 3.2 holds when $d = 2$.

Proof. When $d = 2$, Proposition 3.3 shows that for all $m \in \mathbb{N}$:

\[ \lambda^{\hat c_d}_t(m, m+1) = \lambda^{\hat c_d}_t(m, m+2) = 0, \]
\[ \lambda^{\hat c_d}_t(2m, 2m-2) = 2m, \quad\text{and}\quad \lambda^{\hat c_d}_t(2m-1, 2m-2) = 2(2m-1). \]

The optimality of $\hat c_d$ when $d = 2$ now follows from the proof of Theorem 3.1
of (0). □
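The rates in Proposition 3.3 are simple enough to tabulate exactly. The sketch below uses exact rational arithmetic; the function names `c3_applies` and `rates` are hypothetical, and the code does nothing more than encode (3.8) and (3.9) so that the $d = 2$ values in the proof of Corollary 3.4 and the attainment of the bound (3.6) can be checked mechanically.

```python
from fractions import Fraction

def c3_applies(m, d):
    """True when [C3] is used: m odd and m < 2(d-1)/(d-2).

    Written as m(d-2) < 2(d-1) so that d = 2 (threshold infinite) needs no special case.
    """
    return m % 2 == 1 and m * (d - 2) < 2 * (d - 1)

def rates(m, d):
    """Return (lambda(m, m-2), lambda(m, m-1)) under the candidate coupling."""
    if m <= 0:
        return Fraction(0), Fraction(0)
    if c3_applies(m, d):
        return Fraction(0), Fraction(m * d, d - 1)                   # (3.9)
    return Fraction(m, d - 1), Fraction(m * (d - 2), d - 1)          # (3.8)
```

For instance, `rates(6, 2)` gives `(6, 0)` and `rates(5, 2)` gives `(0, 10)`, matching the corollary, and in every case `(d - 1) * (one + 2 * two)` equals `m * d`, where `two, one = rates(m, d)`.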

For a strategy $c \in \mathcal{C}$, define the process $S^c_t$ by

\[ S^c_t = \hat v_d(X^c_t, Y^c_t, T - t), \]

where $T > 0$ is some fixed time. This is the conditional probability of $X$ and
$Y$ not having coupled by time $T$, when strategy $c$ has been followed over the
interval $[0, t]$ and $\hat c_d$ has then been used from time $t$ onwards.
Now, (point process) stochastic calculus yields:

\[ dS^c_t = dZ^c_t + \Big[ \mathcal{A}^c_t \hat v_d - \frac{\partial \hat v_d}{\partial t} \Big] dt, \tag{3.10} \]

where $Z^c_t$ is a martingale, and $\mathcal{A}^c_t$ is the "generator" corresponding to the
matrix $Q^c(t)$. Due to the independence of the Poisson processes $\Lambda_{(i,k)(j,l)}$,
for any function $f : G_d^n \times G_d^n \times \mathbb{R}_+ \to \mathbb{R}$, $\mathcal{A}^c_t$ satisfies

\[ \mathcal{A}^c_t f(x, y, t) = \frac{1}{d-1} \sum_{i,j} \sum_{k,l} q^c_{(i,k)(j,l)}(t) \Big[ f\big(x + [k - x(i)]e_i,\ y + [l - y(j)]e_j,\ t\big) - f(x, y, t) \Big]. \]
Since $\hat v_d$ is invariant under coordinate permutation, if $|x - y| = m$ then

\[ \mathcal{A}^c_t \hat v_d(m, t) = \sum_{s=-2}^{2} \lambda^c_t(m, m+s) \big[ \hat v_d(m+s, t) - \hat v_d(m, t) \big]. \]

As in (0), the optimality of $\hat c_d$ will follow by Bellman's principle (0) if it can
be shown that $S^c_{t \wedge \tau^c}$ is a submartingale for all $c \in \mathcal{C}$ (where $s \wedge t = \min\{s, t\}$).
This in turn follows if we can show that $\mathcal{A}^c_t \hat v_d$ is minimised by setting $c = \hat c_d$
(since $\mathcal{A}^{\hat c_d}_t \hat v_d - \partial \hat v_d / \partial t = 0$). Thus we seek to maximise over $\lambda \in L_d^n$, for all
$m \ge 0$ and all $t \ge 0$,

\[ \sum_{s=-2}^{2} \lambda(m, m+s) \big[ \hat v_d(m, t) - \hat v_d(m+s, t) \big]. \tag{3.11} \]

Our first step is to simplify this maximisation problem by showing that
$\hat v_d(m, t)$ is strictly increasing in $m$. Define $r_d(m, t) = \hat v_d(m, t) - \hat v_d(m-1, t)$.
The proof of the following result can be found in Section 5.

Lemma 3.5. The tail probability $\hat v_d(m, t)$ is strictly increasing in $m$.

Thus $r_d(m, t) > 0$ for all $m \ge 1$ and $t > 0$. It follows that the terms
appearing on the right-hand side of equation (3.11) are nonpositive if and
only if $s$ is nonnegative. Hence we must set

\[ \lambda(m, m+1) = \lambda(m, m+2) = 0 \]

in order to achieve the maximum in (3.11). It therefore now suffices to
maximise

\begin{align*}
&\lambda(m, m-2) \big[ \hat v_d(m, t) - \hat v_d(m-2, t) \big] + \lambda(m, m-1) \big[ \hat v_d(m, t) - \hat v_d(m-1, t) \big] \\
&\qquad = \lambda(m, m-2) \big[ r_d(m, t) + r_d(m-1, t) \big] + \lambda(m, m-1)\, r_d(m, t)
\end{align*}

subject to the constraint from (3.6):

\[ (d-1)\lambda(m, m-1) + 2(d-1)\lambda(m, m-2) \le md. \]

Putting these together we see that we need to maximise (for $m \ge 2$)

\[ \lambda(m, m-2) \big[ r_d(m-1, t) - r_d(m, t) \big]. \tag{3.12} \]
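The step from the constrained objective to (3.12) can be spelled out as follows. This is a sketch of the intermediate algebra rather than a quotation from the paper; it uses only the constraint (3.6) and the positivity of $r_d(m, t)$ from Lemma 3.5, which makes it optimal to take $\lambda(m, m-1)$ as large as the constraint allows.

```latex
\begin{align*}
&\lambda(m,m-2)\bigl[r_d(m,t) + r_d(m-1,t)\bigr] + \lambda(m,m-1)\, r_d(m,t) \\
&\qquad \le \lambda(m,m-2)\bigl[r_d(m,t) + r_d(m-1,t)\bigr]
      + \Bigl(\frac{md}{d-1} - 2\lambda(m,m-2)\Bigr) r_d(m,t) \\
&\qquad = \frac{md}{d-1}\, r_d(m,t)
      + \lambda(m,m-2)\bigl[r_d(m-1,t) - r_d(m,t)\bigr].
\end{align*}
```

The first term on the last line does not depend on $\lambda$, so only the second term, which is exactly (3.12), remains to be maximised.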

Theorem 3.6. For all $d \ge 4$, $m \ge 2$ and $t \ge 0$, $r_d(m-1, t) \ge r_d(m, t)$.

(See Section 5 for the proof.) This result allows us to finally complete the
proof of Theorem 3.2 when $d \ge 4$. (The proof for $d = 3$ follows a similar line
of argument, using an amended version of Theorem 3.6.)

Proof of Theorem 3.2 ($d \ge 4$).
When $d \ge 4$, Theorem 3.6 and the preceding discussion show that the
optimal strategy must maximise $\lambda(m, m-2)$ for $m \ge 2$. Using the bounds in
(3.5) and (3.6), it follows that this is equivalent to requiring

\[ (d-1)\lambda(m, m-2) = m \quad\text{and}\quad (d-1)\lambda(m, m-1) = m(d-2). \]

But by Proposition 3.3 this is in complete agreement with the rates $\lambda^{\hat c_d}_t$
arising from using our candidate optimal strategy, $\hat c_d$. Thus $\hat c_d$ is truly an
optimal co-adapted coupling, as claimed. □

4. Limiting behaviour. In this section we briefly consider the limiting
behaviour of the coupling time $\hat\tau_d$ of the optimal co-adapted coupling, both
as $n \to \infty$ and $d \to \infty$. Recall that the coupling inequality bounds the
tail distribution of the coupling time of any coupling of $X$ and $Y$ from below
by the total variation distance between the two processes (0):

\[ \| \mathcal{L}(X_t) - \mathcal{L}(Y_t) \|_{TV} \le P[\tau > t]. \tag{4.1} \]

Furthermore, due to the general results of (0; 10), there exists a maximal
coupling $c^*$: one whose coupling time $\tau^*$ achieves equality in (4.1). Such a
coupling is not, in general, co-adapted. A natural question is whether the
optimal co-adapted coupling for $(X, Y)$ described in the previous section is
also maximal. (This was answered in the negative by Connor & Jacka (0)
when $d = 2$.)

First suppose that $d$ is fixed. Denote by $\pi_d$ the uniform distribution on
$G_d^n$ (recall that $\pi_d$ is the equilibrium distribution of $X$), and by $\tau^*_d$ the coupling
time of a maximal coupling for the chains $(X, Y)$ where $X_0 = 0$ and $Y_0 \sim \pi_d$.
The following result is a simple generalisation of (0, Proposition 1):

Lemma 4.1. Let

\[ T_d = \frac{d-1}{2d} \log n. \]

Then as $n \to \infty$, for all $\theta \in \mathbb{R}$,

\[ \| \mathcal{L}(X_{T_d + \theta}) - \pi_d \|_{TV} = 2\Phi\Big( \tfrac{1}{2} \sqrt{d-1}\, e^{-d\theta/(d-1)} \Big) - 1 + o(1), \tag{4.2} \]

where $\Phi(\cdot)$ is the standard normal distribution function.

This shows that the distance between $\mathcal{L}(X)$ and $\pi_d$ exhibits a cutoff
phenomenon (0; 0; 0) at time $T_d$. Thus $E[\tau^*_d] \sim \frac{d-1}{2d} \log n$.

We can bound $E[\hat\tau_d]$ as follows. Under $\hat c_d$, $N = N^{\hat c_d}$ is a decreasing process,
with jumps being of size $-1$ or $-2$. Suppose that $N = 2m$ and hence is even.
Then the total rate at which $N$ jumps is equal (by [C2]) to

\[ \lambda^{\hat c_d}_t(2m, 2m-2) + \lambda^{\hat c_d}_t(2m, 2m-1) = \frac{2m}{d-1} + \frac{2m(d-2)}{d-1} = 2m = N. \tag{4.3} \]

For $k = 1, 2$, let $M^k$ be a process that takes only steps of size $-k$, at
rate $M^k$, and let $\tau^k$ be the time taken for $M^k$ to be absorbed at zero.
If $N_0 = M^k_0 = 2m$ then $E[\tau^2] \le E[\hat\tau_d] \le E[\tau^1]$, thanks to Lemma 3.5.
Furthermore,

\[ E\big[ \tau^k \mid M^k_0 = 2m \big] = \sum_{i=1}^{2m/k} (ki)^{-1} \sim \frac{1}{k} \log m. \]

Averaging over the starting state of $Y_0$ we see that

\[ \tfrac{1}{2} \log n \lesssim E[\hat\tau_d] \lesssim \log n \quad\text{as } n \to \infty. \]

Therefore $E[\hat\tau_d] > E[\tau^*_d]$ for all sufficiently large $n$, and so the optimal
co-adapted coupling is not maximal for any fixed $d$.
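The sandwich $E[\tau^2] \le E[\hat\tau_d] \le E[\tau^1]$ can be checked exactly for small cases. The sketch below assumes $d \ge 4$, so that [C3] applies only when a single coordinate is unmatched, and conditions on the first jump of $N$ using the rates (3.8) and (3.9); the names `expected_coupling_time` and `harmonic_bound` are hypothetical.

```python
from fractions import Fraction

def expected_coupling_time(m, d):
    """E[coupling time | N_0 = m] under the rates (3.8)-(3.9), assuming d >= 4."""
    e = [Fraction(0)] * (max(m, 1) + 1)
    e[1] = Fraction(d - 1, d)          # one jump, at rate d/(d-1)
    for j in range(2, m + 1):
        # Mean holding time 1/j, then a jump to j-2 with probability
        # 1/(d-1) and to j-1 with probability (d-2)/(d-1).
        e[j] = Fraction(1, j) + (e[j - 2] + (d - 2) * e[j - 1]) / (d - 1)
    return e[m]

def harmonic_bound(m, k):
    """E[tau^k | M^k_0 = m] = sum_{i=1}^{m/k} 1/(k i), for m divisible by k."""
    return sum(Fraction(1, k * i) for i in range(1, m // k + 1))
```

For example, with $d = 4$ and two unmatched coordinates the expected coupling time is exactly $1$, which lies between `harmonic_bound(2, 2)` $= 1/2$ and `harmonic_bound(2, 1)` $= 3/2$.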

Let us now consider what happens if we let $d \to \infty$ while keeping $n$ fixed.
Suppose that the $d$ points of $G_d$ are equally spaced on the unit interval $[0, 1)$,
at locations $\{0, d^{-1}, 2d^{-1}, \dots\}$. As $d \to \infty$ the random walk $X$ on $G_d^n$, with
$X_0 = 0$, converges in distribution to the random walk $X^\infty$ on $[0, 1)^n$ for which
each coordinate jumps, at incident times of an independent unit-rate Poisson
process, to a new location distributed uniformly on $[0, 1)$. The equilibrium
distribution of $X^\infty$ is of course $\pi_\infty = \mathrm{Uniform}[0, 1)^{\otimes n}$. Let $Y_0 \sim \pi_\infty$.

Lemma 4.2. For $n$ fixed, as $d \to \infty$ the optimal co-adapted coupling $\hat c_d$
of Section 3 tends to a maximal coupling.



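Proof. Let $A_0$ be the set of points in $[0, 1)^n$ which have at least one
coordinate equal to 0. Then, by definition of total variation distance,

\begin{align*}
\| \mathcal{L}(X^\infty_t) - \pi_\infty \|_{TV}
&= \sup_{A \subseteq [0,1)^n} \big( P[X^\infty_t \in A] - \pi_\infty(A) \big) = P[X^\infty_t \in A_0] \\
&= 1 - P[\text{all coordinates of } X^\infty \text{ have jumped by time } t] \\
&= 1 - (1 - e^{-t})^n.
\end{align*}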

Now consider the optimal co-adapted coupling strategy $\hat c_d$ as $d \to \infty$.
From (3.8) we see that

\[ \lambda^{\hat c_d}_t(m, m-1) = \frac{m(d-2)}{d-1} \to m \quad\text{as } d \to \infty, \]

with all other rates tending to zero. Thus the limiting strategy $\hat c_\infty$ may
be described as follows: let $\{\Lambda_i\}_{1}^{n}$ be independent marked unit-rate Poisson
processes, with marks distributed uniformly on $[0, 1)$; flip $(X(i), Y(i))$ to
$(k, k)$ whenever there is an incident on $\Lambda_i$ with mark equal to $k$. If $N$ counts
the number of unmatched coordinates under this coupling, then it is clear
that the only jumps in $N$ are of size $-1$, and that these occur at rate $N$.
The coupling time $\hat\tau_\infty = \inf\{t : N_t = 0\}$ trivially satisfies

\begin{align*}
P[\hat\tau_\infty > t] &= 1 - P[\text{at least one incident on all } \Lambda_i \text{ by time } t] \\
&= 1 - (1 - e^{-t})^n.
\end{align*}

Therefore $\hat c_\infty$ is indeed a maximal coupling. □
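The convergence of the tail probabilities to $1 - (1 - e^{-t})^n$ can also be observed numerically. The sketch below computes $P[\hat\tau_d > t \mid N_0 = m_0]$ for $d \ge 4$ from the rates (3.8) and (3.9) by uniformisation of the death chain $N$. This numerical technique is chosen here for convenience and is not taken from the paper; the name `tail_probability` and the truncation parameter `terms` are assumptions.

```python
import math

def tail_probability(m0, d, t, terms=400):
    """P[coupling time > t | N_0 = m0] under (3.8)-(3.9), for d >= 4.

    Uniformisation: run a discrete chain at Poisson(rate * t) many steps, where
    rate dominates every exit rate (m for m >= 2, d/(d-1) <= 4/3 for m = 1).
    """
    rate = float(max(m0, 2))
    dist = [0.0] * (m0 + 1)             # dist[m]: chain is at m after k steps
    dist[m0] = 1.0
    weight = math.exp(-rate * t)        # Poisson probability of k = 0 steps
    total = 0.0
    for k in range(terms):
        total += weight * sum(dist[1:])         # not yet absorbed at 0
        new = [0.0] * (m0 + 1)
        for m in range(1, m0 + 1):
            p = dist[m]
            if p == 0.0:
                continue
            if m == 1:
                out = d / (d - 1)
                new[0] += p * out / rate
            else:
                out = float(m)
                new[m - 2] += p * (m / (d - 1)) / rate
                new[m - 1] += p * (m * (d - 2) / (d - 1)) / rate
            new[m] += p * (1.0 - out / rate)    # fictitious self-loop
        dist = new
        weight *= rate * t / (k + 1)
    return total
```

With $n = 5$ and $t = 1$, evaluating `tail_probability(5, 10**6, 1.0)` gives a value very close to $1 - (1 - e^{-1})^5$, in line with Lemma 4.2. The default of 400 terms is ample while `rate * t` is of moderate size, and should be increased for large $n$ or $t$.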



Furthermore, if we now let $n \to \infty$, the distance between $\mathcal{L}(X^\infty_t)$ and $\pi_\infty$
again obeys a cutoff phenomenon, this time with cutoff time equal to
$\log n$. (This may appear surprising, since $T_d \to \tfrac{1}{2} \log n$ as $d \to \infty$. However,
note that the expression on the right-hand side of (4.2) tends to one for all
$\theta \in \mathbb{R}$ as $d \to \infty$, showing that $\tfrac{1}{2} \log n$ is not the cutoff time for the limiting
process.)

5. Appendix: Proofs.

Proof of Proposition 3.3. Equation (3.7) is an immediate consequence
of [C1], which implies that no matches are ever broken under $\hat c_d$. When
$N_t = m$ is even or satisfies $m \ge 2(d-1)/(d-2)$, it follows from [C2](i) and
equation (3.3) that

\[ (d-1)\lambda^{\hat c_d}_t(m, m-2) = \sum_{\substack{i,j \in U_t \\ i \ne j}} \sum_{k,l} \hat q_{(i,k)(j,l)}(t)\, \mathbf{1}_{[k = Y_{t-}(i),\ l = X_{t-}(j)]} = \sum_{\substack{i,j \in U_t \\ i \ne j}} \frac{1}{m-1} = m. \]

Finally, [C1] and [C2] imply that, under $\hat c_d$, the only non-zero term in
equation (3.4) is the first sum, and so

\[ (d-1)\lambda^{\hat c_d}_t(m, m-1) = \sum_{i \in U_t} \sum_{1 \le k \le d} \hat q_{(i,k)(i,k)}(t). \]

Substituting the values of $\hat q_{(i,k)(i,k)}(t)$ from [C2] and [C3] completes the proof.

□

Proof of Lemma 3.5. When $d = 2$, this result follows trivially from
the explicit representation of $\hat v_d$ given in (0). For $d > 2$ however, the result
is less obvious and requires a formal proof. We detail here the proof for the
case when $d \ge 4$ (for which case [C2] of Definition 3.1 applies for all $m > 1$):
the proof when $d = 3$ is similar, using the remark following equation (5.2)
whenever [C3] applies (i.e. when $m \in \{1, 3\}$).

We begin by considering $\hat v_d(1, t)$. By (3.7) and (3.9) it follows directly
that for all values of $d$,

\[ \hat v_d(1, t) = \exp\Big( -\frac{dt}{d-1} \Big). \tag{5.1} \]


Now consider (for $m > 1$) that part of the coupling described in [C2].
From (3.8), the total rate at which $N_t$ can change under [C2] is given by

\[ \lambda^{\hat c_d}_t(m, m-2) + \lambda^{\hat c_d}_t(m, m-1) = \frac{m}{d-1} + \frac{m(d-2)}{d-1} = m. \]

Using this, along with (3.7) and (3.8), we obtain for $m > 1$:

\begin{align*}
\hat v_d(m, t) &= e^{-mt} + \int_0^t m e^{-mu} \Big[ \frac{\lambda^{\hat c_d}_t(m, m-2)}{m}\, \hat v_d(m-2, t-u) + \frac{\lambda^{\hat c_d}_t(m, m-1)}{m}\, \hat v_d(m-1, t-u) \Big] du \\
&= e^{-mt} + \int_0^t \frac{e^{-mu}}{d-1} \big[ m \hat v_d(m-2, t-u) + m(d-2) \hat v_d(m-1, t-u) \big] du. \tag{5.2}
\end{align*}

(A similar expression can be obtained for $\hat v_d(m, t)$ under [C3], noting that
the total rate at which $N_t$ can change in this case is $md/(d-1)$.)
Define $V^\alpha_d(m)$ to be the Laplace transform of $\hat v_d(m, \cdot)$:

\[ V^\alpha_d(m) = \int_0^\infty e^{-\alpha t}\, \hat v_d(m, t)\, dt. \]

It then follows from (5.2) that, for $m > 1$,

\[ V^\alpha_d(m) = \frac{1}{m + \alpha} + \frac{1}{(d-1)(m + \alpha)} \big( m V^\alpha_d(m-2) + m(d-2) V^\alpha_d(m-1) \big), \tag{5.3} \]

and so (rearranging)

\[ (m + \alpha)(d-1) V^\alpha_d(m) = (d-1) + m V^\alpha_d(m-2) + m(d-2) V^\alpha_d(m-1). \tag{5.4} \]

Finally, with $r_d(m, t) = \hat v_d(m, t) - \hat v_d(m-1, t)$, for $\alpha \ge 0$, let

\[ R^\alpha_d(m) = \int_0^\infty e^{-\alpha t}\, r_d(m, t)\, dt. \]

We need to show that $r_d(m, t) = \hat v_d(m, t) - \hat v_d(m-1, t) > 0$ for all $m \ge 1$
and $t > 0$. By the Bernstein-Widder theorem (Theorem 1a, Chapter
XIII.4), this is equivalent to showing that $R^\alpha_d(m)$ is totally monotone as a
function of $\alpha$. We begin by showing that this is true when $m = 1$ and $m = 2$,
and then use induction. From (5.1) we see that

\[ R^\alpha_d(1) = V^\alpha_d(1) = \frac{d-1}{d + \alpha(d-1)}, \tag{5.5} \]

and is therefore totally monotone.

Furthermore, using (5.3) and (5.5) we obtain

\[ R^\alpha_d(2) = \frac{1}{2 + \alpha} + \Big( \frac{2(d-2)}{(2 + \alpha)(d-1)} - 1 \Big) V^\alpha_d(1) = \frac{d-2}{(2 + \alpha)(d + \alpha(d-1))}. \tag{5.6} \]

Since the product of two totally monotone functions is itself totally
monotone (0), it follows that $R^\alpha_d(2)$ is also totally monotone, as desired.

Now suppose that we have already shown $R^\alpha_d(m)$ and $R^\alpha_d(m-1)$ to be
totally monotone, for some $m \ge 2$. Subtracting $(m + \alpha)(d-1) V^\alpha_d(m-1)$
from both sides of (5.4) yields

\[ (m + \alpha)(d-1) R^\alpha_d(m) = (d-1) \big[ 1 - \alpha V^\alpha_d(m-1) \big] - m R^\alpha_d(m-1). \tag{5.7} \]

Substituting $m + 1$ for $m$ in this expression we obtain

\[ (m + 1 + \alpha)(d-1) R^\alpha_d(m+1) = (d-1) \big[ 1 - \alpha V^\alpha_d(m) \big] - (m+1) R^\alpha_d(m), \tag{5.8} \]

and then subtracting (5.7) from (5.8) yields

\begin{align*}
(m + 1 + \alpha)(d-1) R^\alpha_d(m+1) &= (d-1)\alpha \big[ V^\alpha_d(m-1) - V^\alpha_d(m) \big] + m R^\alpha_d(m-1) \\
&\qquad - \big[ (m+1) - (m + \alpha)(d-1) \big] R^\alpha_d(m) \\
&= m R^\alpha_d(m-1) + \big[ m(d-2) - 1 \big] R^\alpha_d(m). \tag{5.9}
\end{align*}

Since $m \ge 2$ and $d > 2$ we see that $m(d-2) - 1 > 0$. Hence, by our
induction hypothesis, $R^\alpha_d(m+1)$ can be expressed as the sum of two totally
monotone functions, and so is itself totally monotone. This completes the
proof. □



Proof of Theorem 3.6. In a similar fashion to the proof of Lemma 3.5,
we show positivity of $r_d(m-1, t) - r_d(m, t)$ by showing $R^\alpha_d(m-1) - R^\alpha_d(m)$
to be totally monotone, again using induction in $m$. From (5.5) and (5.6)
we see that

\[ R^\alpha_d(1) - R^\alpha_d(2) = \frac{1}{2 + \alpha}, \]

and so is totally monotone. Using (5.9) it can be deduced that

\[ R^\alpha_d(2) - R^\alpha_d(3) = \frac{d-4}{(2 + \alpha)(3 + \alpha)(d-1)}. \]

Since $d \ge 4$, this difference is also totally monotone.
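The closed forms for the first few Laplace transforms can be verified with exact rational arithmetic. The sketch below is a check rather than part of the proof: it evaluates $V^\alpha_d(m)$ from (5.5) and the recursion (5.4) at a rational value of $\alpha$, assuming $d \ge 4$ so that (5.4) holds for every $m \ge 2$. The names `laplace_V` and `laplace_R` are hypothetical.

```python
from fractions import Fraction

def laplace_V(m_max, d, a):
    """Return [V(0), ..., V(m_max)] at Laplace argument a (m_max >= 1, d >= 4)."""
    V = [Fraction(0)] * (m_max + 1)          # V(0) = 0 by convention
    V[1] = Fraction(d - 1) / (d + a * (d - 1))                       # (5.5)
    for m in range(2, m_max + 1):
        # Recursion (5.4), solved for V(m).
        V[m] = ((d - 1) + m * V[m - 2] + m * (d - 2) * V[m - 1]) / ((m + a) * (d - 1))
    return V

def laplace_R(V, m):
    """R(m) = V(m) - V(m-1), the transform of r_d(m, .)."""
    return V[m] - V[m - 1]
```

For example, with `V = laplace_V(3, 5, Fraction(1, 3))`, the difference `laplace_R(V, 1) - laplace_R(V, 2)` equals `1 / (2 + Fraction(1, 3))`, in agreement with the first display above.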

Now assume that $R^\alpha_d(m-2) - R^\alpha_d(m-1)$ and $R^\alpha_d(m-1) - R^\alpha_d(m)$ are
totally monotone, for some $m \ge 3$. Substituting $m - 1$ for $m$ in (5.9) yields

\[ (m + \alpha)(d-1) R^\alpha_d(m) = (m-1) R^\alpha_d(m-2) + \big[ (m-1)(d-2) - 1 \big] R^\alpha_d(m-1), \tag{5.10} \]

and subtracting (5.9) from (5.10) shows that

\begin{align*}
(m + 1 + \alpha)(d-1) \big[ R^\alpha_d(m) - R^\alpha_d(m+1) \big] ={}& \big( (m-1)(d-2) - 2 \big) \big[ R^\alpha_d(m-1) - R^\alpha_d(m) \big] \\
&+ (m-1) \big[ R^\alpha_d(m-2) - R^\alpha_d(m-1) \big].
\end{align*}

Finally, since $m \ge 3$ and $d > 3$, $(m-1)(d-2) - 2 > 0$ and so it follows from
our induction hypothesis that $R^\alpha_d(m) - R^\alpha_d(m+1)$ is the sum of two totally
monotone functions, and hence is itself totally monotone, as claimed. □